How do you guys handle 'garbage data' discovered during ETL? • r/Database

Hopefully my title isn't too poorly worded. I am currently upgrading a client's old transaction-based DB to something a bit more modern that locks down their workflow better, so that problems like this hopefully don't arise in the future. To give a brief overview, they use it to track hours on tubes and capacitors used in transmitters, in order to calculate average lifespans and perform other calculations. Devices are tied to meters whose readings are updated daily.

I've got the old data transformed and loaded into the new system, but running through some basic sanity checks I'm finding quite a bit of data that simply doesn't make sense. On a very basic level, there are transactions whose IN Dates are later than their OUT Dates.
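
To make the IN/OUT check concrete, here is a minimal Python sketch of that kind of sanity check. It is only an illustration: the field names (`id`, `in_date`, `out_date`) and the sample rows are hypothetical and not taken from the client's schema, and treating a missing OUT date as "still in service" is an assumption. It sets suspect rows aside for review rather than deleting them.

```python
from datetime import date


def split_by_date_sanity(transactions):
    """Separate rows whose IN date is later than their OUT date."""
    clean, suspect = [], []
    for row in transactions:
        in_date, out_date = row["in_date"], row["out_date"]
        # Assumption: a row with no OUT date is still in service, not an error.
        if out_date is not None and in_date > out_date:
            suspect.append(row)
        else:
            clean.append(row)
    return clean, suspect


# Hypothetical sample rows, invented purely for illustration.
rows = [
    {"id": 1, "in_date": date(2015, 3, 1), "out_date": date(2016, 1, 10)},
    {"id": 2, "in_date": date(2016, 5, 4), "out_date": date(2016, 2, 1)},  # IN after OUT
    {"id": 3, "in_date": date(2017, 7, 9), "out_date": None},  # no OUT date yet
]

clean, suspect = split_by_date_sanity(rows)
print([r["id"] for r in suspect])  # prints [2]
```

Keeping the suspect rows in a separate list (or table) means they can be reviewed with the client instead of being silently dropped or "fixed" during the load.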